pacman::p_load(sf, spNetwork, tmap, tidyverse)In-class Exercise 3: Network Constrained Spatial Point Patterns Analysis
1 Overview
Network Constrained Spatial Point Patterns Analysis (NetSPAA) is a set of methods specifically designed to analyze spatial point events that occur on or alongside a network, such as traffic accidents or childcare centers along road or river networks.
In this hands-on exercise, we explore how to use the spNetwork package to:
Derive network kernel density estimation (NKDE).
Perform network G-function and K-function analysis.
2 The Data
In this study, we aim to analyze the spatial distribution of childcare centers within the Punggol planning area. To achieve this, we will utilize two specific geospatial datasets:
Punggol_St: consists of line features representing the road network within the Punggol planning area.
Punggol_CC: comprises point features indicating the locations of childcare centers within the same area.
Both datasets are provided in the ESRI shapefile format.
3 Install and launch R packages
Below code chunk install and launch the four R packages
4 Data Import & Preparation
st_read() of sf package is used to import Punggol_St and Punggol_CC geospatial data sets as sf data frames using below code chunk.
network <- st_read(dsn="data/geospatial",
layer="Punggol_St")Reading layer `Punggol_St' from data source
`C:\thuphuong1110\ISSS626-GAA\In-class_Ex\In-class_Ex03\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
childcare <- st_read(dsn="data/geospatial",
layer="Punggol_CC") %>%
st_zm(drop = TRUE,
what = "ZM")Reading layer `Punggol_CC' from data source
`C:\thuphuong1110\ISSS626-GAA\In-class_Ex\In-class_Ex03\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
z_range: zmin: 0 zmax: 0
Projected CRS: SVY21 / Singapore TM
The below code chunk below can be used to print the content of network and childcare simple features objects.
childcareSimple feature collection with 61 features and 1 field
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 34423.98 ymin: 41503.6 xmax: 37619.47 ymax: 44685.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
Name geometry
1 kml_10 POINT (36173.81 42550.33)
2 kml_99 POINT (36479.56 42405.21)
3 kml_100 POINT (36618.72 41989.13)
4 kml_101 POINT (36285.37 42261.42)
5 kml_122 POINT (35414.54 42625.1)
6 kml_161 POINT (36545.16 42580.09)
7 kml_172 POINT (35289.44 44083.57)
8 kml_188 POINT (36520.56 42844.74)
9 kml_205 POINT (36924.01 41503.6)
10 kml_222 POINT (37141.76 42326.36)
networkSimple feature collection with 2642 features and 2 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 34038.56 ymin: 40941.11 xmax: 38882.85 ymax: 44801.27
Projected CRS: SVY21 / Singapore TM
First 10 features:
LINK_ID ST_NAME geometry
1 116130894 PUNGGOL RD LINESTRING (36546.89 44574....
2 116130897 PONGGOL TWENTY-FOURTH AVE LINESTRING (36546.89 44574....
3 116130901 PONGGOL SEVENTEENTH AVE LINESTRING (36012.73 44154....
4 116130902 PONGGOL SEVENTEENTH AVE LINESTRING (36062.81 44197....
5 116130907 PUNGGOL CENTRAL LINESTRING (36131.85 42755....
6 116130908 PUNGGOL RD LINESTRING (36112.93 42752....
7 116130909 PUNGGOL CENTRAL LINESTRING (36127.4 42744.5...
8 116130910 PUNGGOL FLD LINESTRING (35994.98 42428....
9 116130911 PUNGGOL FLD LINESTRING (35984.97 42407....
10 116130912 EDGEFIELD PLNS LINESTRING (36200.87 42219....
Note : spNetwork’s function expects the geospatial data to contain complete CRS information.
5 Visualize Geospatial Data
Before diving into the analysis, it’s beneficial to visualize the geospatial data. There are at least two methods to achieve this. One approach is to use the plot() function from Base R, as demonstrated in the code snippet below.
plot(st_geometry(network))
plot(childcare,add=T,col='red',pch = 19) # add = True: overlay the points on the currently open window screen
To visualise the geospatial data with high cartographic quality and interactive manner, the mapping function of tmap package can be used as shown in the code chunk below.
tmap_mode("view")tmap mode set to interactive viewing
tm_shape(childcare) +
tm_dots(col = "red") +
tm_shape(network) +
tm_lines()tmap_mode("plot")tmap mode set to plotting
6 Network KDE (NKDE) Analysis
In this section, we perform NKDE analysis using appropriate functions from spNetwork package.
6.1 Prepare the lixels objects
Before computing NKDE, the SpatialLines object needs to be divided into lixels (line pixels) with a specified minimum distance. This can be done using the lixelize_lines() function from the spNetwork package, as shown in the code chunk below.
lixels <- lixelize_lines(network,
700,
mindist = 350)The followings are observed from the above code chunk.
The length of each lixel (lx_length) is set to 700 meters. This is a reasonable walking distance for parents/grandparents to walk their kids to the childcare centre.
The minimum length of a lixel (mindist) is set to 350 meters.
After cutting, if the length of the final lixel is shorter than the minimum distance, it is merged with the previous lixel. If mindist = NULL, then it defaults to maxdist/10. Segments that are already shorter than the minimum distance are left unmodified.
Note: The lixelize_lines.mc() function offers multicore support for this process.
6.2 Generate line centre points
Next, lines_center() of spNetwork is used to generate a SpatialPointsDataFrame (i.e. samples) with line centre points as shown in the code chunk below.
samples <- lines_center(lixels) The points are located at center of the line based on the length of the line.
Plotting the lixel and sampling points
tmap_mode("view")tmap mode set to interactive viewing
tm_shape(lixels) +
tm_lines() +
tm_shape(samples) +
tm_dots(size = 0.01)tmap_mode("plot")tmap mode set to plotting
6.3 Perform NKDE
The NKDE is computed using below code chunk.
densities <- nkde(network,
events = childcare,
w = rep(1, nrow(childcare)),
samples = samples,
kernel_name = "quartic",
bw = 300,
div= "bw",
method = "simple",
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
sparse = TRUE,
verbose = FALSE)Note: Some of the key arguments given above code chunk:
The kernel_name argument indicates that the
quartickernel is being used. Other kernel methods supported by spNetwork include: triangle, gaussian, scaled gaussian, tricube, cosine, triweight, epanechnikov, or uniform.The method argument specifies that the
simplemethod is used for NKDE calculation. There are three supported methods in spNetwork:method=
simple: Proposed by Xie et al. (2008), this method adapts the kernel formula for network distances and calculates density over a linear unit instead of an areal unit.method=
discontinuous: Suggested by Okabe et al. (2008), this method divides mass density equally at intersections of lixels.method=
continuous: Also proposed by Okabe et al. (2008), this version adjusts the density before intersections to create a continuous function while still dividing mass at intersections.
It is recommended to read the spNetwork package user guide for a deeper understanding of the various parameters available for calibrating the NKDE model.
6.4 Visualize NKDE
Before visualizing the NKDE values, the code chunk below inserts the computed density values (densities) into the samples and lixels objects as a new density field.
samples$density <- densities
lixels$density <- densitiesSince svy21 projection system is in meter, the computed density values are very small (e.g., 0.0000005). The code chunk below rescales the density values from events per meter to events per kilometer.
# rescaling to help the mapping
samples$density <- samples$density*1000
lixels$density <- lixels$density*1000The code below uses appropriate functions of tmap package to prepare interactive and high cartographic quality map visualisation.
tmap_mode('view')tmap mode set to interactive viewing
tm_shape(lixels)+
tm_lines(col="density")+
tm_shape(childcare)+
tm_dots()tmap_mode('plot')tmap mode set to plotting
The interactive map above effectively reveals road segments with relatively higher density of childcare centres (darker color) than road segments with relatively lower density of childcare centres (lighter color).
7 Network Constrained G- and K-Function Analysis
In this section, we perform complete spatial randomness (CSR) test using kfunctions() of spNetwork package. The null hypothesis is defined as:
H0: The observed spatial point events (i.e distribution of childcare centres) are randomly distributed over a street network in Punggol Planning Area.
The CSR test assumes a binomial point process, meaning the centres are randomly and independently distributed. If rejected, it indicates the centres are spatially dependent and form nonrandom patterns.
kfun_childcare <- kfunctions(network,
childcare,
start = 0,
end = 1000,
step = 50,
width = 50,
nsim = 49,
resolution = 50,
verbose = FALSE,
conf_int = 0.05)Note: 9 key arguments used in above code chunk:
lines: A feature collection of linestrings representing the underlying network. The geometries must be simple Linestrings
points: A
sfdata frame representing points on the network. These points will be snapped on their nearest line.start: A double, the start value for evaluating the k and g functions.
end: A double, the last value for evaluating the k and g functions.
step: A double, specifying the interval between evaluations of the k and g functions.
width: The width of each donut for the g-function.
nsim: An integer for the number of Monte Carlo simulations. typically, more than 50 simulations are needed for inference.
resolution: A value to reduce calculation time when simulating random points by splitting edges and selecting vertices.
conf_int: A double for setting the confidence interval (default is 0.05).
For more details on these arguments, refer to the spNetwork user guide.
The output of kfunctions() is a list containing:
plotkA: A ggplot2 object representing the k-function values.plotgA: A ggplot2 object representing the g-function values.valuesA: A DataFrame containing the data used to generate the plots.
For example, the ggplot2 object of k-function can be visualized using the following code chunk.
kfun_childcare$plotk
The blue line is the empirical network K-function of the childcare centres in Punggol planning area. The gray envelop represents the results of the 50 simulations in the confidence interval 2.5% - 97.5%. Since the blue line segment between the distance 125m-400m is below the gray area, we can infer that the childcare centres in Punggol planning area resemble regular pattern at the distance of 125m-400m.
Next we plot the g-function.
kfun_childcare$plotg
The blue line is the empirical network G-function of the childcare centres in Punggol planning area. The gray envelop represents the results of the 50 simulations in the confidence interval 2.5% - 97.5%. Since the blue line segment between the distance 125m-150m is below the gray area, we can infer that the childcare centres in Punggol planning area resemble regular pattern at the distance of 125m-150m.